9 research outputs found

    Evaluation of machine-learning methods for ligand-based virtual screening

    Machine-learning methods can be used for virtual screening by analysing the structural characteristics of molecules of known (in)activity, and we here discuss the use of kernel discrimination and naive Bayesian classifier (NBC) methods for this purpose. We report a kernel method that allows the processing of molecules represented by binary, integer and real-valued descriptors, and show that it differs little in screening performance from a previously described kernel that had been developed specifically for the analysis of binary fingerprint representations of molecular structure. We then evaluate the performance of an NBC when the training set contains only very few active molecules. In such cases, a simpler approach based on group fusion would appear to provide superior screening performance, especially when structurally heterogeneous datasets are to be processed.
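
The group-fusion idea mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes binary fingerprints stored as sets of on-bit indices, the Tanimoto coefficient as the similarity measure, and the MAX fusion rule, none of which are specified in the abstract. The function names and the toy molecules are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]  # assumed representation: indices of bits set to 1


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient between two binary fingerprints."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def group_fusion_rank(
    actives: List[Fingerprint], database: Dict[str, Fingerprint]
) -> List[Tuple[float, str]]:
    """Score each database molecule by its maximum similarity to any known
    active (MAX fusion, an assumed rule) and return (score, name) pairs,
    best first; ties are broken by name."""
    scored = [
        (max(tanimoto(fp, ref) for ref in actives), name)
        for name, fp in database.items()
    ]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Toy data, purely illustrative.
actives = [frozenset({1, 2, 3}), frozenset({7, 8})]
database = {
    "m1": frozenset({1, 2, 3, 4}),
    "m2": frozenset({7, 8}),
    "m3": frozenset({20, 21}),
}
ranking = group_fusion_rank(actives, database)
```

Because each database molecule only needs to resemble one of the reference actives, a scheme of this kind can cope with structurally heterogeneous actives even when very few are known.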

    Boosting Pose Ranking Performance via Rescoring with MM-GBSA

    In a previous self-docking study, we have shown that structure reproduction performance can be improved by rescoring GOLD ChemPLP docking poses with the MM-GBSA scoring function. In this work, we attempt to better understand this improvement. We increase the size and diversity of the examined dataset, and perform self-docking using a curated set of over 700 complexes. The “scoring problem” (the inability to unambiguously identify the biologically most relevant pose) is assessed with respect to both MM-GBSA and ChemPLP scoring functions. Heavy atom root mean squared deviation (RMSD) values are used to compare the docked poses with the crystallographic ones. In addition to this standard metric, “partial matching” is introduced. This algorithm captures the visual observation that the majority of a ligand can be well docked and yet report an RMSD value of > 2.0 Å. Often this is attributable to arbitrary placements of flexible side chains in undefined solvent regions. The metrics introduced by this algorithm are applicable for assessing the contribution of ligand sampling to the scoring problem. It is shown that rescoring ChemPLP poses with the MM-GBSA scoring function improves pose ranking by better discriminating against non-cognate-like poses. However, the key finding of this study is that absolute rank is less important than the score difference between the docking poses. Thus, poses should not be retained solely on their ranks, but on the score difference relative to the best ranked pose.
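
The closing recommendation of this abstract, retaining poses by score difference rather than by rank, could look like the sketch below. It assumes that lower (more negative) scores are better, as is conventional for MM-GBSA energies; the function name, the threshold value and the scores are hypothetical and are not taken from the study.

```python
from typing import Dict, List


def retain_poses(scores: Dict[str, float], max_delta: float) -> List[str]:
    """Return the poses whose score lies within max_delta of the best
    (lowest) score, ordered from best to worst.

    Assumption: lower score = better pose. max_delta is a user-chosen
    cutoff; the abstract does not state a value.
    """
    best = min(scores.values())
    kept = [pose for pose, s in scores.items() if s - best <= max_delta]
    return sorted(kept, key=lambda pose: scores[pose])


# Illustrative scores only (arbitrary units).
example_scores = {"pose1": -52.0, "pose2": -51.2, "pose3": -40.0}
kept = retain_poses(example_scores, max_delta=2.0)
```

With a rank-based rule such as "keep the top two", pose2 would be kept regardless of how far it trails pose1; the score-difference rule keeps it only because it is nearly as good as the best pose.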

    MM/GBSA Binding Energy Prediction on the PDBbind Data Set: Successes, Failures, and Directions for Further Improvement

    We validate an automated implementation of a combined Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) method (VSGB 2.0 energy model) on a large and diverse selection of protein-ligand complexes (855 complexes). Although this dataset is diverse with respect to both protein families and ligands, after carefully removing flawed structures, a significant correlation (R² = 0.63) between calculated and experimental binding affinities is obtained. Consistent explanations for “outlier” complexes are found. Visual analysis of the crystal structures and recourse to the original literature reveal that neglect of explicit solvent, ligand strain, and entropy contribute to the under- and overestimation of computed affinities. The limits of the Molecular Mechanics/Implicit Solvent approach to accurately estimate protein-ligand binding affinities are discussed, as is the influence of the quality of protein-ligand complexes on computed free energy binding values.
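
For readers unfamiliar with the correlation statistic quoted in this abstract, the following sketch computes R² as the squared Pearson correlation between calculated and experimental values. Whether the authors computed R² in exactly this way is an assumption, and the numbers below are toy values, not data from the study.

```python
from typing import Sequence


def r_squared(calculated: Sequence[float], experimental: Sequence[float]) -> float:
    """Squared Pearson correlation coefficient between two equal-length series."""
    if len(calculated) != len(experimental) or len(calculated) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    n = len(calculated)
    mean_c = sum(calculated) / n
    mean_e = sum(experimental) / n
    # Sums of cross-products and squared deviations from the means.
    s_ce = sum((c - mean_c) * (e - mean_e) for c, e in zip(calculated, experimental))
    s_cc = sum((c - mean_c) ** 2 for c in calculated)
    s_ee = sum((e - mean_e) ** 2 for e in experimental)
    return (s_ce * s_ce) / (s_cc * s_ee)


# Toy values chosen for easy arithmetic; not real binding affinities.
toy_calculated = [1.0, 2.0, 3.0, 4.0]
toy_experimental = [1.0, 3.0, 2.0, 4.0]
toy_r2 = r_squared(toy_calculated, toy_experimental)
```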

    How to Computationally Stack the Deck for Hit-to-Lead Generation: In Silico Molecular Interaction Energy Profiling for De Novo siRNA Guide Strand Surrogate Selection

    The Argonaute-2 protein is part of the RNA-induced silencing complex (RISC) and anchors the guide strand of the small interfering RNA (siRNA). The 3' end of the RNA contains two unpaired nucleotides (3'-overhang) that interact with the PAZ (PIWI-Argonaute-Zwille) domain of the protein. Theoretical and experimental evidence points towards a direct connection between the PAZ/3'-overhang binding affinity and siRNA's potency and specificity. Among the challenges to overcome when deploying siRNA molecules as therapeutics are their ready degradation under physiological conditions, and off-target effects. It has been demonstrated that nuclease resistance can be improved via replacement of the dinucleotide overhang by small molecules which retain the interactions of the RNA guide strand with the PAZ domain. Most commonly, nucleotide analogues are used to substitute the siRNA overhang. However, in this study we adopt a de novo approach to its modification. The X-ray structure of human Argonaute-2 PAZ domain served to perform virtual screening and molecular interaction energy profiling (i.e., decomposition of the force field calculated protein-ligand interaction energies) of tailored-to-purpose fragment libraries. The binding of fragments to the PAZ domain was validated experimentally by NMR spectroscopy. The in silico guided protocol led to the efficient discovery of a number of PAZ domain ligands with affinities comparable to that of a reference dinucleotide (UpU, Kd = 33 µM). Originally starting from a generic fragment library, hits progress from 930 µM down to 14 µM within 3 iterations for the fragments selected via in silico molecular interaction energy profiling from a bespoke library. These dinucleotide siRNA guide strand surrogates represent potential new siRNA-based therapeutics (when attached to siRNA to form bioconjugates) featuring improved efficacy, specificity, stability and cellular uptake. This project yielded a portfolio of 7 patent applications, two of which have been granted to date.

    Introducing the consensus modeling concept in genetic algorithms: application to interpretable discriminant analysis.

    An evolutionary statistical learning method was applied to classify drugs according to their biological target and also to discriminate between a compilation of oral and nonoral drugs. The emphasis was placed not only on how well the models predict but also on their interpretability. In an enhancement to previous studies, the consistency of the model weights over several runs of the genetic algorithm was considered with the goal of producing comprehensible models. Via this approach, the descriptors and their ranges that contribute most to class discrimination were identified. Selecting a bin step size that enables the average descriptor properties of the class being trained to be captured improves the interpretability and discriminatory power of a model. The performance, consistency, and robustness of such models were further enhanced by using two novel approaches that reduce the variability between individual solutions: consensus and splice modeling. Finally, the ability of the genetic algorithm to discriminate between activity classes was compared with a similarity searching method, while naïve Bayes classifiers and support vector machines were applied in discriminating the oral and nonoral drugs.

    Discovery of novel indolinone-based, potent, selective and brain penetrant inhibitors of LRRK2

    Mutations in leucine-rich repeat kinase-2 (LRRK2) are the most common genetic cause of Parkinson's disease (PD). The most frequent kinase-enhancing mutation is G2019S, which resides in the kinase activation domain. This opens up a promising therapeutic avenue for drug discovery targeting the kinase activity of LRRK2 in PD. Several LRRK2 inhibitors have been reported to date. Here, we report a selective, brain penetrant LRRK2 inhibitor and demonstrate in vivo target engagement in mice by a competition pulldown assay.